Hannah Flaherty
2018-06-14
print()
browser()
debug()
head(perfect_dataset)
# A tibble: 6 x 7
id species name age legs weight color
<int> <chr> <chr> <dbl> <int> <int> <chr>
1 1 cat Ada 9 4 12 white, gray
2 2 dog Mobius 11 4 55 black
3 3 dog Mazy 119 4 40 black, white
4 4 cat Bea 7 4 8 black, white
5 5 dog Judge 8 4 75 brown, black
6 6 cat Visitor 15 4 10 gray, white
Great for a quick look at the class of each column and its numbers, and for getting a sense of missing values.
summary(perfect_dataset)
       id          species              name                age
 Min.   : 1.00   Length:19          Length:19          Min.   :  1.00
 1st Qu.: 5.50   Class :character   Class :character   1st Qu.:  2.75
 Median :10.00   Mode  :character   Mode  :character   Median :  6.00
 Mean   :10.37                                         Mean   : 11.97
 3rd Qu.:15.50                                         3rd Qu.:  8.50
 Max.   :20.00                                         Max.   :119.00
      legs           weight         color
 Min.   :0.000   Min.   : 2.00   Length:19
 1st Qu.:4.000   1st Qu.:10.00   Class :character
 Median :4.000   Median :15.00   Mode  :character
 Mean   :3.684   Mean   :25.06
 3rd Qu.:4.000   3rd Qu.:40.00
 Max.   :4.000   Max.   :75.00
                 NA's   :2
Not so helpful for character fields, which only report length, class, and mode.
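For character columns, counting how often each value occurs is usually more informative than summary(). The sketch below is only an illustration: it assumes dplyr is installed, and `animals` is a hypothetical stand-in built from the six rows shown in the head() output, not the full perfect_dataset.

```r
library(dplyr)

# Hypothetical stand-in: only the six rows printed by head(perfect_dataset).
animals <- tibble(
  id      = 1:6,
  species = c("cat", "dog", "dog", "cat", "dog", "cat"),
  name    = c("Ada", "Mobius", "Mazy", "Bea", "Judge", "Visitor"),
  age     = c(9, 11, 119, 7, 8, 15),
  legs    = rep(4L, 6),
  weight  = c(12L, 55L, 40L, 8L, 75L, 10L),
  color   = c("white, gray", "black", "black, white",
              "black, white", "brown, black", "gray, white")
)

# Frequency of each distinct value, most common first.
species_counts <- animals %>% count(species, sort = TRUE)
color_counts   <- animals %>% count(color, sort = TRUE)

print(species_counts)
print(color_counts)
```

On the real data, the same count() call applies to any character column.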
Sometimes you just want to look at your data!
View(perfect_dataset)
But what if it's too much data?
View(
  world_cities_pop %>%
    group_by(Country) %>%
    summarise(num_cities = n())
)
And you don't have to save the result in your environment.
Verify all assertions about your dataset.
“All animals in this dataset are mammals.”
perfect_dataset %>%
group_by(species) %>%
summarise(length(species))
# A tibble: 3 x 2
species `length(species)`
<chr> <int>
1 cat 10
2 dog 8
3 snake 1
Almost…
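An assertion like this can also be checked in code instead of by eye. The following is a minimal sketch in base R: the `species` vector is rebuilt from the counts in the table above, and `mammals` is a hypothetical lookup list introduced only for this example.

```r
# Species column rebuilt from the counts above: 10 cats, 8 dogs, 1 snake.
species <- c(rep("cat", 10), rep("dog", 8), "snake")

# Hypothetical list of the species we are willing to call mammals.
mammals <- c("cat", "dog")

# Species present in the data that are not on the mammal list.
not_mammal <- setdiff(unique(species), mammals)
all_mammals <- length(not_mammal) == 0

if (!all_mammals) {
  message("Assertion failed, non-mammals found: ",
          paste(not_mammal, collapse = ", "))
}
```

To make a script stop as soon as the assertion breaks, `stopifnot(all_mammals)` could replace the `if` block.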
After your manipulation, does it look right? Does it make sense with what you were expecting?
perfect_dataset %>%
group_by(species) %>%
summarise(mean(age))
# A tibble: 3 x 2
species `mean(age)`
<chr> <dbl>
1 cat 6.9000
2 dog 19.6875
3 snake 1.0000
Wait, what?
# A tibble: 6 x 7
id species name age legs weight color
<int> <chr> <chr> <dbl> <int> <int> <chr>
1 3 dog Mazy 119 4 40 black, white
2 6 cat Visitor 15 4 10 gray, white
3 2 dog Mobius 11 4 55 black
4 1 cat Ada 9 4 12 white, gray
5 19 cat Henry 9 4 15 black, white
6 5 dog Judge 8 4 75 brown, black
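The code behind the table above is not shown. One way to get a view like it, assuming dplyr, is to sort by age in descending order and look at the top rows. `animals` here is a hypothetical stand-in holding only the seven rows that appear in the printed output, not the full perfect_dataset.

```r
library(dplyr)

# Hypothetical stand-in: the seven rows visible in the printed tables.
animals <- tibble(
  id      = c(1L, 2L, 3L, 4L, 5L, 6L, 19L),
  species = c("cat", "dog", "dog", "cat", "dog", "cat", "cat"),
  name    = c("Ada", "Mobius", "Mazy", "Bea", "Judge", "Visitor", "Henry"),
  age     = c(9, 11, 119, 7, 8, 15, 9),
  legs    = rep(4L, 7),
  weight  = c(12L, 55L, 40L, 8L, 75L, 10L, 15L),
  color   = c("white, gray", "black", "black, white", "black, white",
              "brown, black", "gray, white", "black, white")
)

# Oldest animals first, so implausible ages rise to the top.
oldest <- animals %>%
  arrange(desc(age)) %>%
  head()

print(oldest)
```

Sorting puts the 119-year-old dog in the first row, which explains the inflated mean age for dogs.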
perfect_dataset %>%
group_by(species) %>%
summarise(mean(legs))
# A tibble: 3 x 2
species `mean(legs)`
<chr> <dbl>
1 cat 3.900
2 dog 3.875
3 snake 0.000
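When a group mean looks off, filtering for the rows that break the expectation shows where it comes from. The sketch below assumes dplyr. `legs_data` is hypothetical: the per-animal leg counts are invented to be consistent with the group means above (one three-legged cat, one three-legged dog, one snake), since the individual rows are not shown.

```r
library(dplyr)

# Hypothetical leg counts chosen to reproduce the means above:
# cats 39/10 = 3.9, dogs 31/8 = 3.875, snake 0.
legs_data <- tibble(
  species = c(rep("cat", 10), rep("dog", 8), "snake"),
  legs    = c(rep(4L, 9), 3L, rep(4L, 7), 3L, 0L)
)

# Recompute the per-species means to confirm the stand-in matches.
legs_means <- legs_data %>%
  group_by(species) %>%
  summarise(mean_legs = mean(legs))

# Rows that violate the "every animal has four legs" expectation.
unexpected_legs <- legs_data %>% filter(legs != 4)

print(legs_means)
print(unexpected_legs)
```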